个性化新闻推荐旨在通过预测他们点击某些文章的可能性为读者提供有吸引力的文章。为了准确预测这种概率,已经提出了充足的研究,以积极利用物品的内容特征,例如单词,类别或实体。然而,我们观察到,文章的语境特征,例如CTR(点击率),流行度或新鲜度,最近被忽视或未充分利用。为了证明这是这种情况,我们在近期深度学习模型和天真的上下文模型之间进行了广泛的比较,我们设计得令人惊讶地发现后者很容易表现前者。此外,我们的分析表明,近期将过度复杂的深度学习业务应用于上下文功能的趋势实际上妨碍了推荐性能。根据这些知识,我们设计了一个有目的的简单上下文模块,可以通过大的边距提高上一个新闻推荐模型。
translated by 谷歌翻译
Monocular depth estimation is a challenging problem on which deep neural networks have demonstrated great potential. However, depth maps predicted by existing deep models usually lack fine-grained details due to the convolution operations and the down-samplings in networks. We find that increasing input resolution is helpful to preserve more local details while the estimation at low resolution is more accurate globally. Therefore, we propose a novel depth map fusion module to combine the advantages of estimations with multi-resolution inputs. Instead of merging the low- and high-resolution estimations equally, we adopt the core idea of Poisson fusion, trying to implant the gradient domain of high-resolution depth into the low-resolution depth. While classic Poisson fusion requires a fusion mask as supervision, we propose a self-supervised framework based on guided image filtering. We demonstrate that this gradient-based composition performs much better at noisy immunity, compared with the state-of-the-art depth map fusion method. Our lightweight depth fusion is one-shot and runs in real-time, making our method 80X faster than a state-of-the-art depth fusion method. Quantitative evaluations demonstrate that the proposed method can be integrated into many fully convolutional monocular depth estimation backbones with a significant performance boost, leading to state-of-the-art results of detail enhancement on depth maps.
translated by 谷歌翻译
Machine Learning (ML) interatomic models and potentials have been widely employed in simulations of materials. Long-range interactions often dominate in some ionic systems whose dynamics behavior is significantly influenced. However, the long-range effect such as Coulomb and Van der Wales potential is not considered in most ML interatomic potentials. To address this issue, we put forward a method that can take long-range effects into account for most ML local interatomic models with the reciprocal space neural network. The structure information in real space is firstly transformed into reciprocal space and then encoded into a reciprocal space potential or a global descriptor with full atomic interactions. The reciprocal space potential and descriptor keep full invariance of Euclidean symmetry and choice of the cell. Benefiting from the reciprocal-space information, ML interatomic models can be extended to describe the long-range potential including not only Coulomb but any other long-range interaction. A model NaCl system considering Coulomb interaction and the GaxNy system with defects are applied to illustrate the advantage of our approach. At the same time, our approach helps to improve the prediction accuracy of some global properties such as the band gap where the full atomic interaction beyond local atomic environments plays a very important role. In summary, our work has expanded the ability of current ML interatomic models and potentials when dealing with the long-range effect, hence paving a new way for accurate prediction of global properties and large-scale dynamic simulations of systems with defects.
translated by 谷歌翻译
Spatial-temporal (ST) graph modeling, such as traffic speed forecasting and taxi demand prediction, is an important task in deep learning area. However, for the nodes in graph, their ST patterns can vary greatly in difficulties for modeling, owning to the heterogeneous nature of ST data. We argue that unveiling the nodes to the model in a meaningful order, from easy to complex, can provide performance improvements over traditional training procedure. The idea has its root in Curriculum Learning which suggests in the early stage of training models can be sensitive to noise and difficult samples. In this paper, we propose ST-Curriculum Dropout, a novel and easy-to-implement strategy for spatial-temporal graph modeling. Specifically, we evaluate the learning difficulty of each node in high-level feature space and drop those difficult ones out to ensure the model only needs to handle fundamental ST relations at the beginning, before gradually moving to hard ones. Our strategy can be applied to any canonical deep learning architecture without extra trainable parameters, and extensive experiments on a wide range of datasets are conducted to illustrate that, by controlling the difficulty level of ST relations as the training progresses, the model is able to capture better representation of the data and thus yields better generalization.
translated by 谷歌翻译
This work presents Time-reversal Equivariant Neural Network (TENN) framework. With TENN, the time-reversal symmetry is considered in the equivariant neural network (ENN), which generalizes the ENN to consider physical quantities related to time-reversal symmetry such as spin and velocity of atoms. TENN-e3, as the time-reversal-extension of E(3) equivariant neural network, is developed to keep the Time-reversal E(3) equivariant with consideration of whether to include the spin-orbit effect for both collinear and non-collinear magnetic moments situations for magnetic material. TENN-e3 can construct spin neural network potential and the Hamiltonian of magnetic material from ab-initio calculations. Time-reversal-E(3)-equivariant convolutions for interactions of spinor and geometric tensors are employed in TENN-e3. Compared to the popular ENN, TENN-e3 can describe the complex spin-lattice coupling with high accuracy and keep time-reversal symmetry which is not preserved in the existing E(3)-equivariant model. Also, the Hamiltonian of magnetic material with time-reversal symmetry can be built with TENN-e3. TENN paves a new way to spin-lattice dynamics simulations over long-time scales and electronic structure calculations of large-scale magnetic materials.
translated by 谷歌翻译
Mixup is a popular data augmentation technique based on creating new samples by linear interpolation between two given data samples, to improve both the generalization and robustness of the trained model. Knowledge distillation (KD), on the other hand, is widely used for model compression and transfer learning, which involves using a larger network's implicit knowledge to guide the learning of a smaller network. At first glance, these two techniques seem very different, however, we found that ``smoothness" is the connecting link between the two and is also a crucial attribute in understanding KD's interplay with mixup. Although many mixup variants and distillation methods have been proposed, much remains to be understood regarding the role of a mixup in knowledge distillation. In this paper, we present a detailed empirical study on various important dimensions of compatibility between mixup and knowledge distillation. We also scrutinize the behavior of the networks trained with a mixup in the light of knowledge distillation through extensive analysis, visualizations, and comprehensive experiments on image classification. Finally, based on our findings, we suggest improved strategies to guide the student network to enhance its effectiveness. Additionally, the findings of this study provide insightful suggestions to researchers and practitioners that commonly use techniques from KD. Our code is available at https://github.com/hchoi71/MIX-KD.
translated by 谷歌翻译
磁共振图像的降解有益于提高低信噪比图像的质量。最近,使用深层神经网络进行DENOSING表现出了令人鼓舞的结果。但是,这些网络大多数都利用监督学习,这需要大量的噪声和清洁图像对的培训图像。获得训练图像,尤其是干净的图像,既昂贵又耗时。因此,已经开发了仅需要成对噪声浪费图像的噪声2Noise(N2N)之类的方法来减轻获得训练数据集的负担。在这项研究中,我们提出了一种新的自我监督的denoising方法Coil2Coil(C2C),该方法不需要获取干净的图像或配对的噪声浪费图像进行训练。取而代之的是,该方法利用了从分阶段阵列线圈中的多通道数据来生成训练图像。首先,它将多通道线圈图像分为两个图像,一个用于输入,另一个用于标签。然后,它们被处理以施加噪声独立性和敏感性归一化,以便它们可用于N2N的训练图像。为了推断,该方法输入了一个线圈组合的图像(例如DICOM图像),从而允许该方法的广泛应用。当使用合成噪声添加的图像进行评估时,C2C对几种自我监督方法显示了最佳性能,从而报告了与监督方法的可比结果。在测试DICOM图像时,C2C成功地将真实噪声降低,而没有显示误差图中的结构依赖性残差。由于不需要对清洁或配对图像进行额外扫描的显着优势,因此可以轻松地用于各种临床应用。
translated by 谷歌翻译
伤口图像分割是伤口临床诊断和时间治疗的关键成分。最近,深度学习已成为伤口图像分割的主流方法。但是,在训练阶段之前,需要进行伤口图像的预处理,例如照明校正,因为可以大大提高性能。校正程序和深层模型的训练是彼此独立的,这导致了次优的分割性能,因为固定的照明校正可能不适合所有图像。为了解决上述问题,本文提出了一种端到端的双视分段方法,通过将可学习的照明校正模块纳入深度细分模型中。可以在训练阶段自动学习和更新模块的参数,而双视融合可以完全利用RAW图像和增强图像的功能。为了证明拟议框架的有效性和鲁棒性,在基准数据集上进行了广泛的实验。令人鼓舞的结果表明,与最先进的方法相比,我们的框架可以显着改善细分性能。
translated by 谷歌翻译
深度学习大大提高了单眼深度估计(MDE)的性能,这是完全基于视觉的自主驾驶(AD)系统(例如特斯拉和丰田)的关键组成部分。在这项工作中,我们对基于学习的MDE产生了攻击。特别是,我们使用基于优化的方法系统地生成隐形的物理对象贴片来攻击深度估计。我们通过面向对象的对抗设计,敏感的区域定位和自然风格的伪装来平衡攻击的隐身和有效性。使用现实世界的驾驶场景,我们评估了对并发MDE模型的攻击和AD的代表下游任务(即3D对象检测)。实验结果表明,我们的方法可以为不同的目标对象和模型生成隐形,有效和健壮的对抗贴片,并在物体检测中以1/1/的斑点检测到超过6米的平均深度估计误差和93%的攻击成功率(ASR)车辆后部9个。具有实际车辆的三个不同驾驶路线上的现场测试表明,在连续视频帧中,我们导致超过6米的平均深度估计误差,并将对象检测率从90.70%降低到5.16%。
translated by 谷歌翻译
估计路径的旅行时间是智能运输系统的重要主题。它是现实世界应用的基础,例如交通监控,路线计划和出租车派遣。但是,为这样的数据驱动任务构建模型需要大量用户的旅行信息,这与其隐私直接相关,因此不太可能共享。数据所有者之间的非独立和相同分布的(非IID)轨迹数据也使一个预测模型变得极具挑战性,如果我们直接应用联合学习。最后,以前关于旅行时间估算的工作并未考虑道路的实时交通状态,我们认为这可以极大地影响预测。为了应对上述挑战,我们为移动用户组引入GOF-TTE,生成的在线联合学习框架以进行旅行时间估计,这是我)使用联合学习方法,允许在培训时将私人数据保存在客户端设备上,并设计设计和设计。所有客户共享的全球模型作为在线生成模型推断实时道路交通状态。 ii)除了在服务器上共享基本模型外,还针对每个客户调整了一个微调的个性化模型来研究其个人驾驶习惯,从而弥补了本地化全球模型预测的残余错误。 %iii)将全球模型设计为所有客户共享的在线生成模型,以推断实时道路交通状态。我们还对我们的框架采用了简单的隐私攻击,并实施了差异隐私机制,以进一步保证隐私安全。最后,我们对Didi Chengdu和Xi'an的两个现实世界公共出租车数据集进行了实验。实验结果证明了我们提出的框架的有效性。
translated by 谷歌翻译